Bi-correlation clustering algorithm for determining a set of co-regulated genes

نویسندگان

  • Anindya Bhattacharya
  • Rajat K. De
چکیده

MOTIVATION Biclustering has been emerged as a powerful tool for identification of a group of co-expressed genes under a subset of experimental conditions (measurements) present in a gene expression dataset. Several biclustering algorithms have been proposed till date. In this article, we address some of the important shortcomings of these existing biclustering algorithms and propose a new correlation-based biclustering algorithm called bi-correlation clustering algorithm (BCCA). RESULTS BCCA has been able to produce a diverse set of biclusters of co-regulated genes over a subset of samples where all the genes in a bicluster have a similar change of expression pattern over the subset of samples. Moreover, the genes in a bicluster have common transcription factor binding sites in the corresponding promoter sequences. The presence of common transcription factors binding sites, in the corresponding promoter sequences, is an evidence that a group of genes in a bicluster are co-regulated. Biclusters determined by BCCA also show highly enriched functional categories. Using different gene expression datasets, we demonstrate strength and superiority of BCCA over some existing biclustering algorithms. AVAILABILITY The software for BCCA has been developed using C and Visual Basic languages, and can be executed on the Microsoft Windows platforms. The software may be downloaded as a zip file from http://www.isical.ac.in/ approximately rajat. Then it needs to be installed. Two word files (included in the zip file) need to be consulted before installation and execution of the software. CONTACT [email protected] SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Gene expression module discovery using gibbs sampling.

Recent advances in high throughput profiling of gene expression have catalyzed an explosive growth in functional genomics aimed at the elucidation of genes that are differentially expressed in various tissue or cell types across a range of experimental conditions. These studies can lead to the identification of diagnostic genes, classification of genes into functional categories, association of...

متن کامل

Clustering of a Number of Genes Affecting in Milk Production using Information Theory and Mutual Information

Information theory is a branch of mathematics. Information theory is used in genetic and bioinformatics analyses and can be used for many analyses related to the biological structures and sequences. Bio-computational grouping of genes facilitates genetic analysis, sequencing and structural-based analyses. In this study, after retrieving gene and exon DNA sequences affecting milk yield in dairy ...

متن کامل

A cross-species bi-clustering approach to identifying conserved co-regulated genes

MOTIVATION A growing number of studies have explored the process of pre-implantation embryonic development of multiple mammalian species. However, the conservation and variation among different species in their developmental programming are poorly defined due to the lack of effective computational methods for detecting co-regularized genes that are conserved across species. The most sophisticat...

متن کامل

Average correlation clustering algorithm (ACCA) for grouping of co-regulated genes with similar pattern of variation in their expression values

Distance based clustering algorithms can group genes that show similar expression values under multiple experimental conditions. They are unable to identify a group of genes that have similar pattern of variation in their expression values. Previously we developed an algorithm called divisive correlation clustering algorithm (DCCA) to tackle this situation, which is based on the concept of corr...

متن کامل

Mining co-regulated gene profiles for the detection of functional associations in gene expression data

MOTIVATION Association pattern discovery (APD) methods have been successfully applied to gene expression data. They find groups of co-regulated genes in which the genes are either up- or down-regulated throughout the identified conditions. These methods, however, fail to identify similarly expressed genes whose expressions change between up- and down-regulation from one condition to another. In...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Bioinformatics

دوره 25 21  شماره 

صفحات  -

تاریخ انتشار 2009